-
Learning interaction potentials from the structure factor is frequently seen as impractical because of the accuracy constraints of neutron and X-ray scattering experiments. This study reexamines this historic inverse problem using Bayesian inference and probabilistic machine learning on a Mie fluid to elucidate how measurement noise impacts the accuracy of recovered potentials. For reliable potential reconstruction, we recommend that scattering data have noise smaller than 0.005 up to ∼30 Å⁻¹ at a standard bin width of 0.05 Å⁻¹. At uncertainties below this threshold, Mie potentials can be determined to within approximately ±1.3 for the repulsive exponent, ±0.068 Å for atomic size, and ±0.024 kcal/mol for well depth with 95% confidence. These findings highlight the potential of uniting scattering and machine learning to overcome a century-old physics problem, infer local atomic forces to serve as a vital benchmark …
Free, publicly accessible full text available December 26, 2025.
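As a reading aid, the three quantities reported above (repulsive exponent, atomic size, well depth) map onto the standard Mie pair potential. The sketch below assumes the common form with the attractive exponent fixed at 6, which the abstract does not state; the function names `mie_potential` and `mie_minimum` are hypothetical and are not taken from the paper.

```python
def mie_potential(r, epsilon, sigma, n, m=6.0):
    """Mie pair potential U(r) = C * epsilon * [(sigma/r)**n - (sigma/r)**m].

    r       : pair separation (same length unit as sigma)
    epsilon : well depth (energy unit of the result)
    sigma   : atomic size, the separation at which U(r) = 0
    n       : repulsive exponent
    m       : attractive exponent (assumed to be 6 here, not stated in the abstract)
    """
    if r <= 0:
        raise ValueError("r must be positive")
    if n <= m:
        raise ValueError("repulsive exponent n must exceed attractive exponent m")
    # The prefactor C makes the minimum of U(r) equal to exactly -epsilon.
    c = (n / (n - m)) * (n / m) ** (m / (n - m))
    x = sigma / r
    return c * epsilon * (x ** n - x ** m)


def mie_minimum(sigma, n, m=6.0):
    """Separation at which the Mie potential reaches its minimum of -epsilon."""
    return sigma * (n / m) ** (1.0 / (n - m))
```

With `n=12` and `m=6` the prefactor equals 4 and the expression reduces to the Lennard-Jones potential. Calling `mie_potential(mie_minimum(sigma, n), epsilon, sigma, n)` returns `-epsilon` up to rounding, for any valid parameter choice.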
-
While Bayesian inference is the gold standard for uncertainty quantification and propagation, its use within physical chemistry encounters formidable computational barriers. These bottlenecks are magnified when modeling data with many independent variables, such as X-ray/neutron scattering patterns and electromagnetic spectra. To address this challenge, we employ local Gaussian process (LGP) surrogate models to accelerate Bayesian optimization over these complex thermophysical properties. The time complexity of the LGPs scales linearly in the number of independent variables, in stark contrast to the computationally expensive cubic scaling of conventional Gaussian processes. To illustrate the method, we trained an LGP surrogate model on the radial distribution function of liquid neon and observed a 1,760,000-fold speed-up compared to molecular dynamics simulation, beating a conventional GP by three orders of magnitude. We conclude that LGPs are robust and efficient surrogate models poised to expand the application of Bayesian inference in molecular simulations to a broad spectrum of experimental data.
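One simple way to see how linear scaling in the number of output points can arise is to fit an independent GP mean at each output coordinate (for example, each radial bin of a distribution function) while sharing a single kernel matrix over the training parameters. The sketch below is only an illustration of that idea under stated assumptions: it uses NumPy, a squared-exponential kernel, and shared hyperparameters, and it is not claimed to be the authors' LGP construction. All function names are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between rows of a (p, d) and b (q, d)."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)


def fit_local_gps(theta_train, y_train, length_scale=1.0, jitter=1e-8):
    """Fit one GP mean per output column, sharing one kernel matrix.

    theta_train : (n, d) array of model parameters (assumed training inputs)
    y_train     : (n, m) array, one column per output coordinate
    Returns an (n, m) array of weights, one column per local GP.
    """
    k = rbf_kernel(theta_train, theta_train, length_scale)
    k[np.diag_indices_from(k)] += jitter  # small diagonal term for stability
    chol = np.linalg.cholesky(k)          # factorized once: O(n^3)
    # Solving for all m columns together costs O(n^2 * m): linear in m.
    return np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))


def predict_local_gps(theta_new, theta_train, alpha, length_scale=1.0):
    """Posterior mean at new parameters for every output coordinate."""
    return rbf_kernel(theta_new, theta_train, length_scale) @ alpha
```

In this arrangement the kernel matrix is factorized once for the n training parameter sets, and each additional output coordinate adds only one more right-hand side. A single GP over the joint space of parameters and output coordinate would instead factorize an (n·m) × (n·m) matrix, which is where cubic growth in m comes from.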
-
Vegetation classifications on large geographic scales are necessary to inform conservation decisions and monitor keystone, invasive, and endangered species. These classifications are often effectively achieved by applying models to imaging spectroscopy, a type of remote sensing data, but such undertakings are often limited in spatial extent. Here we provide accurate, high-resolution spatial data on the keystone species Metrosideros polymorpha, a highly polymorphic tree species distributed across bioclimatic zones and environmental gradients on Hawai’i Island, using airborne imaging spectroscopy and LiDAR. We compare two tree species classification techniques, the support vector machine (SVM) and spectral mixture analysis (SMA), to assess their ability to map M. polymorpha over 28,000 square kilometers where differences in topography, background vegetation, sun angle relative to the aircraft, and day of data collection, among others, challenge accurate classification. To capture spatial variability in model performance, we applied Gaussian process classification (GPC) to estimate the spatial probability density of M. polymorpha occurrence using only training sample locations. We found that while SVM and SMA models exhibit similar raw score accuracy over the test set (96.0% and 93.4%, respectively), SVM better reproduces the spatial distribution of M. polymorpha than SMA. We developed a final 2 m × 2 m M. polymorpha presence dataset and a 30 m × 30 m M. polymorpha density dataset using SVM classifications; both have been made publicly available for use in conservation applications. Accurate, large-scale species classifications are achievable, but metrics for model performance assessments must account for spatial variation of model accuracy.
